Abstract

We generalize the technique of smoothed analysis to distributed algorithms in dynamic network
models. Whereas standard smoothed analysis studies the impact of small random perturbations
of input values on algorithm performance metrics, dynamic graph smoothed analysis
studies the impact of random perturbations of the underlying changing network graph topologies.
Similar to the original application of smoothed analysis, our goal is to study whether
known strong lower bounds in dynamic network models are robust or fragile: do they withstand 
small (random) perturbations, or do such deviations push the graphs far enough from a precise 
pathological instance to enable much better performance? Fragile lower bounds are likely not 
relevant for real-world deployment, while robust lower bounds represent a true difficulty caused 
by dynamic behavior. We apply this technique to three standard dynamic network problems 
with known strong worst-case lower bounds: random walks, flooding, and aggregation. We 
prove that these bounds provide a spectrum of robustness when subjected to smoothing—some 
are extremely fragile (random walks), some are moderately fragile / robust (flooding), and some 
are extremely robust (aggregation).

1 Introduction


Dynamic network models describe networks with topologies that change over time (cf. [9]). They
are used to capture the unpredictable link behavior that characterizes challenging networking
scenarios; e.g., connecting and coordinating moving vehicles, nearby smartphones, or nodes in a
widespread and fragile overlay. Because fine-grained descriptions of link behavior in such networks
are hard to specify, most analyses of dynamic networks rely instead on a worst-case selection
of graph changes. This property is crucial to the usefulness of these analyses, as it helps ensure the
results persist in real deployment.

A problem with this worst-case perspective is that it often leads to extremely strong lower
bounds. These strong results motivate a key question: Is a given bound robust in the sense that it
captures a fundamental difficulty introduced by dynamism, or is the bound fragile in the sense that 
the poor performance it describes depends on an exact sequence of adversarial changes? Fragile 
lower bounds leave open the possibility of algorithms that might still perform well in practice. By 
separating fragile from robust results in these models, therefore, we can expand the algorithmic 
tools available to those seeking useful guarantees in these challenging environments. 

In the study of traditional algorithms, an important technique for explaining why algorithms
work well in practice, despite disappointing worst-case performance, is smoothed analysis [14, 15].
This approach studies the expected performance of an algorithm when the inputs are slightly 
randomly perturbed. If a strong lower bound dissipates after a small amount of smoothing, it 
is considered fragile—as it depends on a carefully constructed degenerate case. Note that this is 
different from an “average-case” analysis, which looks at instances drawn from some distribution. In 
a smoothed analysis, you still begin with an adversarially chosen input, but then slightly perturb this 
choice. Of course, as the perturbation grows larger, the input converges toward random. (Indeed, in 
the original smoothed analysis papers [14, 15], the technique is described as interpolating between 
worst and average case analysis.) 

In this paper, we take the natural next step of adapting smoothed analysis to the study of 
distributed algorithms in dynamic networks. Whereas in the traditional setting smoothing typically 
perturbs numerical input values, in our setting we define smoothing to perturb the network graph 
through the random addition and deletion of edges. We claim that a lower bound for a dynamic 
network model that improves with just a small amount of graph smoothing of this type is fragile, 
as it depends on the topology evolving in an exact manner. On the other hand, a lower bound that 
persists even after substantial smoothing is robust, as this reveals a large number of similar graphs 
for which the bound holds. 

Results. We begin by providing a general definition of a dynamic network model that captures 
many of the existing models already studied in the distributed algorithms literature. At the core 
of a dynamic network model is a dynamic graph that describes the evolving network topology. 
We provide a natural definition of smoothing for a dynamic graph that is parameterized with a 
smoothing factor $k \in \{0, 1, \ldots, \binom{n}{2}\}$. In more detail, to $k$-smooth a dynamic graph $\mathcal{H}$ is to replace
each static graph $G$ in $\mathcal{H}$ with a smoothed graph $G'$ sampled uniformly from the space of graphs
that are: (1) within edit distance$^1$ $k$ of $G$, and (2) allowed by the relevant dynamic network
model (e.g., if the model requires the graph to be connected in every round, smoothing cannot 
generate a disconnected graph). 

$^1$The notion of edit distance we use in this paper is the number of edge additions/deletions needed to
transform one graph to another, assuming they share the same node set.

Problem       Graph      k-Smoothed Algorithm           k-Smoothed Lower Bound                0-Smoothed Lower Bound
------------  ---------  -----------------------------  ------------------------------------  ----------------------
Flooding      Connected  $O(n^{2/3} \log(n)/k^{1/3})$   $\Omega(n^{2/3}/k^{1/3})$             $\Omega(n)$
Hitting Time  Connected  $O(n^3/k)$                     $\Omega(n^{5/2}/(\sqrt{k} \log n))$   $\Omega(2^n)$
Aggregation   Paired     $O(n)$-competitive             $\Omega(n)$-competitive               $\Omega(n)$-competitive

Table 1: A summary of our main results. The columns labelled “k-smoothed” assume $k > 0$.
Different results assume different upper bounds on $k$.

We must next argue that these definitions allow for useful discernment between different dynamic
network lower bounds. To this end, we use as case studies three well-known problems with
strong lower bounds in dynamic network models: flooding, random walks, and aggregation. For 
each problem, we explore the robustness/fragility of the existing bound by studying how it improves 
under increasing amounts of smoothing. Our results are summarized in Table 1. We emphasize 
the surprising variety in outcomes: these results capture a wide spectrum of possible responses to 
smoothing, from quite fragile to quite robust. 

For the minimal amount of smoothing ($k = 1$), for example, the $\Omega(2^n)$ lower bound for the
hitting time of a random walk in connected dynamic networks (established in [2]) decreases by an
exponential factor to $O(n^3)$, the $\Omega(n)$ lower bound for flooding time in these same networks
(well-known in folklore) decreases by a polynomial factor to $O(n^{2/3} \log n)$, and the $\Omega(n)$ lower bound on
achievable competitive ratio for token aggregation in pairing dynamic graphs (established in [4])
decreases by only a constant factor.

As we increase the smoothing factor $k$, our upper bound on random walk hitting time decreases
as $O(n^3/k)$, while our flooding upper bound decreases more slowly as $O(n^{2/3} \log n/k^{1/3})$, and our
aggregation bound remains in $\Omega(n)$ for $k$ values as large as $\Theta(n/\log^2 n)$. In all three cases we also prove
tight or near tight lower bounds for all studied values of $k$, showing that these analyses are correctly
capturing the impact of smoothing.

Among other insights, these results indicate that the exponential hitting time lower bound for 
dynamic walks is extremely fragile, while the impossibility of obtaining a good competitive ratio 
for dynamic aggregation is quite robust. Flooding provides an interesting intermediate case. While 
it is clear that an $\Omega(n)$ bound is fragile, the claim that flooding can take a polynomial amount of
time (say, in the range $n^{1/3}$ to $n^{2/3}$) seems well-supported.

Next Steps. The definitions and results that follow represent a first (but far from final) step 
toward the goal of adapting smoothed analysis to the dynamic network setting. There are many 
additional interesting dynamic network bounds that could be subjected to a smoothed analysis. 
Moreover, there are many other reasonable definitions of smoothing beyond the ones we use here. 
While our definition is natural and our results suggestive, for other problems or model variations 
other definitions might be more appropriate. Rather than claiming that our approach here is the 
“right” way to study the fragility of dynamic network lower bounds, we instead claim that smoothed 
analysis generally speaking (in all its various possible formulations) is an important and promising 
tool when trying to understand the fundamental limits of distributed behavior in dynamic network 
settings. 

Related Work. Smoothed analysis was introduced by Spielman and Teng [14, 15], who used the 
technique to explain why the simplex algorithm works well in practice despite strong worst-case 
lower bounds. It has been widely applied to traditional algorithm problems (see [16] for a good 
introduction and survey). Recent interest in studying distributed algorithms in dynamic networks 
was sparked by Kuhn et al. [10]. Many different problems and dynamic network models have since 
been proposed; e.g., [11, 8, 6, 3, 1, 5, 13, 7] (see [9] for a survey). The dynamic random walk lower
bound we study was first proposed by Avin et al. [2], while the dynamic aggregation lower bound
we study was first proposed by Cornejo et al. [4]. We note other techniques have been proposed for
exploring the fragility of dynamic network lower bounds. In recent work, for example, Denysyuk 
et al. [5] thwart the exponential random walk lower bound due to [2] by requiring the dynamic 
graph to include a certain number of static graphs from a well-defined set, while work by Ghaffari 
et al. [7] studies the impact of adversary strength, and Newport [13] studies the impact of graph 
properties, on lower bounds in the dynamic radio network model. 

2 Dynamic Graphs, Networks, and Types 

There is no single dynamic network model. There are, instead, many different models that share 
the same basic behavior: nodes executing synchronous algorithms are connected by a network 
graph that can change from round to round. Details on how the graphs can change and how 
communication behaves given a graph differ between model types. 

In this section we provide a general definition of a dynamic network model that captures
many existing models in the relevant literature. This approach allows us in the next section to 
define smoothing with sufficient generality that it can apply to these existing models. We note 
that in this paper we constrain our attention to oblivious graph behavior (i.e., the changing graph 
is fixed at the beginning of the execution), but that the definitions that follow generalize in a 
straightforward manner to capture adaptive models (i.e., the changing graph can adapt to behavior 
of the algorithm). 

Dynamic Graphs and Networks. Fix some node set $V$, where $n = |V|$. A dynamic graph $\mathcal{H}$,
defined with respect to $V$, is a sequence $G_1, G_2, \ldots$, where each $G_i = (V, E_i)$ is a graph defined
over nodes $V$. If this is not an infinite sequence, then the length of $\mathcal{H}$ is $|\mathcal{H}|$, the number of
graphs in the sequence. A dynamic network, defined with respect to $V$, is a pair $(\mathcal{H}, \mathcal{C})$, where
$\mathcal{H}$ is a dynamic graph, and $\mathcal{C}$ is a communication rules function that maps transmission patterns
to receive patterns. That is, the function takes as input a static graph and an assignment of
messages to nodes, and returns an assignment of received messages to nodes. For example, in the
classical radio network model $\mathcal{C}$ would specify that nodes receive a message only if exactly one of
their neighbors transmits, while in the LOCAL model $\mathcal{C}$ would specify that all nodes receive all
messages sent by their neighbors. Finally, an algorithm maps process definitions to nodes in $V$.

Given a dynamic network $(\mathcal{H}, \mathcal{C})$ and an algorithm $A$, an execution of $A$ in $(\mathcal{H}, \mathcal{C})$ proceeds
as follows: for each round $r$, nodes use their process definition according to $A$ to determine their
transmission behavior, and the resulting receive behavior is determined by applying $\mathcal{C}$ to $\mathcal{H}[r]$
(the graph of $\mathcal{H}$ in round $r$) and this transmission pattern.
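
To make the role of the communication rules function concrete, the following Python sketch implements two such functions, one in the style of the LOCAL model and one in the style of the classical radio model, together with a round-by-round execution loop. It is only an illustration under stated assumptions: the names `local_rule`, `radio_rule`, and `run_execution`, the encoding of a static graph as a set of two-node frozensets, and the encoding of a process as a function from (round, inbox) to an optional message are hypothetical choices, not part of the formal model.

```python
def neighbors(edges, v):
    """Nodes adjacent to v; each edge is a frozenset of two nodes."""
    return {w for e in edges if v in e for w in e if w != v}

def local_rule(nodes, edges, sent):
    """Reliable broadcast: every node receives all messages sent by its neighbors."""
    return {v: sorted(sent[w] for w in neighbors(edges, v)
                      if sent.get(w) is not None)
            for v in nodes}

def radio_rule(nodes, edges, sent):
    """Radio-style rule: a node receives a message only if exactly one neighbor transmits."""
    received = {}
    for v in nodes:
        msgs = [sent[w] for w in neighbors(edges, v) if sent.get(w) is not None]
        received[v] = msgs if len(msgs) == 1 else []
    return received

def run_execution(nodes, dynamic_graph, rule, processes, rounds):
    """Run `rounds` rounds of an execution.

    dynamic_graph is a list of edge sets (index r - 1 holds the graph of round r);
    processes[v](r, inbox) returns the message v transmits in round r, or None.
    """
    inbox = {v: [] for v in nodes}
    for r in range(1, rounds + 1):
        sent = {v: processes[v](r, inbox[v]) for v in nodes}
        inbox = rule(nodes, dynamic_graph[r - 1], sent)
    return inbox
```

Swapping `local_rule` for `radio_rule` changes the communication behavior without touching the dynamic graph, which mirrors the separation between the two components of the pair.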

Dynamic Network Types. When we think of a dynamic network model suitable for running 
executions of distributed algorithms, what we really mean is a combination of a description of how 
communication works, and a set of the different dynamic graphs we might encounter. We formalize 
this notion with the concept of the dynamic network type, which we define as a pair $(\mathcal{G}, \mathcal{C})$, where
$\mathcal{G}$ is a set of dynamic graphs and $\mathcal{C}$ is a communication rules function. For each $\mathcal{H} \in \mathcal{G}$, we say
dynamic network type $(\mathcal{G}, \mathcal{C})$ contains the dynamic network $(\mathcal{H}, \mathcal{C})$.

When proving an upper bound result, we will typically show that the result holds when our 
algorithm is executed in any dynamic network contained within a given type. When proving a 
lower bound result, we will typically show that there exists a dynamic network contained within 
the relevant type for which the result holds. In this paper, we will define and analyze two existing 
dynamic network types: (1-interval) connected networks [10, 11, 8, 6], in which the graph in each 
round is connected and $\mathcal{C}$ describes reliable broadcast to neighbors in the graph, and pairing
networks [4], in which the graph in each round is a matching and $\mathcal{C}$ describes reliable message
passing with each node’s neighbor (if any). 

3 Smoothing Dynamic Graphs 

We now define a version of smoothed analysis that is relevant to dynamic graphs. To begin, we 
define the edit distance between two static graphs $G = (V, E)$ and $G' = (V, E')$ to be the minimum
number of edge additions and removals needed to transform $G$ to $G'$. With this in mind, for a
given $G$ and $k \in \{0, 1, \ldots, \binom{n}{2}\}$, we define the set:

$\mathit{editdist}(G, k) = \{G' \mid \text{the edit distance between } G \text{ and } G' \text{ is no more than } k\}$.

Finally, for a given set of dynamic graphs $\mathcal{G}$, we define the set:

$\mathit{allowed}(\mathcal{G}) = \{G \mid \exists \mathcal{H} \in \mathcal{G} \text{ such that } G \in \mathcal{H}\}$.

In other words, $\mathit{allowed}$ describes all graphs that show up in the dynamic graphs contained in the
set $\mathcal{G}$. Our notion of smoothing is always defined with respect to a dynamic graph set $\mathcal{G}$. Formally:
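
Because the two graphs share a node set, the edit distance is simply the size of the symmetric difference of their edge sets. A minimal Python sketch, assuming edges are stored as two-node frozensets (the function names are illustrative, not taken from the paper):

```python
def edit_distance(edges_a, edges_b):
    """Number of edge additions/removals needed to turn one edge set into the other."""
    return len(set(edges_a) ^ set(edges_b))

def in_editdist(edges_g, edges_candidate, k):
    """True if the candidate graph lies within edit distance k of G."""
    return edit_distance(edges_g, edges_candidate) <= k
```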

Definition 3.1. Fix a set of dynamic graphs $\mathcal{G}$, a dynamic graph $\mathcal{H} \in \mathcal{G}$, and smoothing factor
$k \in \{0, 1, \ldots, \binom{n}{2}\}$. To $k$-smooth a static graph $G \in \mathcal{H}$ (with respect to $\mathcal{G}$) is to replace $G$ with
a graph $G'$ sampled uniformly from the set $\mathit{editdist}(G, k) \cap \mathit{allowed}(\mathcal{G})$. To $k$-smooth the entire
dynamic graph $\mathcal{H}$ (with respect to $\mathcal{G}$) is to replace $\mathcal{H}$ with the dynamic graph $\mathcal{H}'$ that results when
we $k$-smooth each of its static graphs.

We will also sometimes say that $G'$ (resp. $\mathcal{H}'$) is a $k$-smoothed version of $G$ (resp. $\mathcal{H}$), or simply
a $k$-smoothed $G$ (resp. $\mathcal{H}$). We often omit the dynamic graph set $\mathcal{G}$ when it is clear in context.
(Typically, $\mathcal{G}$ will be the set contained in a dynamic network type under consideration.)
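
One way to realize this sampling in code is by rejection: draw uniformly from the edit-distance ball and retry until the result is allowed. The Python sketch below is an illustration under assumptions, not a prescribed implementation. The names `k_smooth` and `is_connected` are hypothetical, graphs are edge sets of two-node frozensets, and the `allowed` predicate is supplied by the caller (connectivity is used here as an example). Since exactly $\binom{N}{i}$ graphs lie at edit distance $i$ when there are $N$ potential edges, the sketch first picks $i$ with weight $\binom{N}{i}$ and then toggles $i$ distinct edges.

```python
import itertools
import math
import random

def is_connected(nodes, edges):
    """Example `allowed` predicate: True if the graph on `nodes` is connected."""
    nodes = list(nodes)
    if not nodes:
        return True
    adj = {v: set() for v in nodes}
    for e in edges:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(nodes)

def k_smooth(nodes, edges, k, allowed, rng=random):
    """Sample uniformly from editdist(G, k) restricted to allowed graphs.

    Assumes the input graph itself is allowed, so the loop terminates with
    probability 1. The float conversion inside rng.choices limits this sketch
    to small node sets.
    """
    potential = [frozenset(p) for p in itertools.combinations(sorted(nodes), 2)]
    # Exactly comb(N, i) graphs are at edit distance i from G.
    weights = [math.comb(len(potential), i) for i in range(k + 1)]
    while True:
        i = rng.choices(range(k + 1), weights=weights)[0]
        toggled = set(rng.sample(potential, i))
        candidate = set(edges) ^ toggled        # toggle the chosen edges
        if allowed(nodes, candidate):
            return candidate
```

Smoothing an entire dynamic graph then amounts to applying `k_smooth` independently to the graph of each round.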

Discussion. Our notion of $k$-smoothing transforms a graph by randomly adding or deleting at most $k$
edges. A key piece of our definition is that smoothing a graph with respect to a dynamic graph
set cannot produce a graph not found in any member of that set. This restriction is particularly
important for proving lower bounds on smoothed graphs, as we want to make sure that the lower
bound result does not rely on a dynamic graph that could not otherwise appear. For example,
if studying a process in a dynamic graph that is always connected, we do not want smoothing to
disconnect the graph, an event that might trivialize some bounds.

4 Connected and Pairing Dynamic Network Types 

This section defines the two dynamic network types studied in this paper: the connected network 
type [10, 11, 8, 6], and the pairing network type [4]. We study random walks (Section 6) and flooding 
(Section 5) in the context of the connected network type, whereas we study token aggregation 
(Section 7) in the context of the pairing type. 

4.1 Connected Network 

The connected network type [10, 11, 8, 6] is defined as $(\mathcal{G}_{conn}, \mathcal{C}_{conn})$, where $\mathcal{G}_{conn}$ contains every
dynamic graph (defined with respect to our fixed node set $V$) in which every individual graph is
connected, and where $\mathcal{C}_{conn}$ describes reliable broadcast (i.e., a message sent by $u$ in round $r$ in
an execution in graph $\mathcal{H}$ is received by every neighbor of $u$ in $\mathcal{H}[r]$).

Properties of Smoothed Connected Networks. For our upper bounds, we show that if
certain edges are added to the graph through smoothing, then the algorithm makes enough progress
on the smoothed graph. For our lower bounds, we show that if certain edges are not added to
the graph, then the algorithm does not make much progress. The following lemmas bound the
probabilities that these edges are added. The proofs roughly amount to showing that sampling
uniformly from $\mathit{editdist}(G, k) \cap \mathit{allowed}(\mathcal{G}_{conn})$ is similar to sampling from $\mathit{editdist}(G, k)$.

The first two lemmas are applicable when upper-bounding the performance of an algorithm 
on a smoothed dynamic graph. The first lemma states that the $k$-smoothed version of graph $G$
is fairly likely to include at least one edge from the set S of helpful edges. The second lemma, 
conversely, says that certain critical edges that already exist in G are very unlikely to be removed 
in the smoothed version. 


Lemma 4.1. There exists constant $c_1 > 0$ such that the following holds. Consider any graph
$G \in \mathit{allowed}(\mathcal{G}_{conn})$. Consider also any nonempty set $S$ of potential edges and smoothing value
$k \le n/16$ with $k|S| \le n^2/2$. Then with probability at least $c_1 k|S|/n^2$, the $k$-smoothed graph $G'$ of
$G$ contains at least one edge from $S$.

Proof. We start by noting that sampling a graph $G_D$ uniformly from $\mathit{editdist}(G, k)$ and then
rejecting (and retrying) if $G_D \notin \mathit{allowed}(\mathcal{G}_{conn})$ is equivalent to sampling uniformly from $k$-smoothed
graphs. Let $A$ be the event that $G_D$ includes an edge from $S$. Let $B$ be the event that $G_D$ is
connected. Our goal is to lower-bound $\Pr[A \mid B]$ starting from $\Pr[A \mid B] \ge \Pr[A \text{ and } B]$.

The proof consists of two cases, depending on whether the input graph $G = (V, E)$ contains
an edge from $S$. For the first case, suppose that it does. Choose an arbitrary edge $e \in S \cap E$,
and choose an arbitrary spanning tree $T$ of $G$. Let $B_{Te}$ be the event that $G_D$ contains all edges
in $T \cup \{e\}$. Note that if $B_{Te}$ occurs, then $G_D$ is both connected and contains an edge from $S$.
Thus, $\Pr[A \text{ and } B] \ge \Pr[B_{Te}]$, and we need only bound $\Pr[B_{Te}]$. Sampling a graph $G_D$ from
$\mathit{editdist}(G, k)$ is equivalent to selecting up to $k$ random edges from among all potential $\binom{n}{2}$ edges
and toggling their status. Consider the $i$th edge toggled through this process. The probability that
the edge is one of the at most $n$ edges in $T \cup \{e\}$ is at most $2n/\binom{n}{2}$, where the loose factor of 2 arises from the
fact that the $i$th edge is only selected from among the $\binom{n}{2} - i$ remaining edges (and $\binom{n}{2} - i \ge \binom{n}{2}/2$
for $k \le \binom{n}{2}/2$). Taking a union bound over at most $k$ edge choices, we have $\Pr[\text{not } B_{Te}] \le 2kn/\binom{n}{2}$.
Thus $\Pr[B_{Te}] \ge 1 - 2kn/\binom{n}{2} \ge 1/2$ for $k \le n/16$.
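
The final inequality of the first case can be checked directly. The following worked step spells out the arithmetic, assuming $n \ge 2$ so that $\binom{n}{2} = n(n-1)/2$ is positive:

```latex
\[
\frac{2kn}{\binom{n}{2}}
  \;=\; \frac{2kn}{n(n-1)/2}
  \;=\; \frac{4k}{n-1}
  \;\le\; \frac{4}{n-1}\cdot\frac{n}{16}
  \;=\; \frac{n}{4(n-1)}
  \;\le\; \frac{1}{2}
  \qquad \text{for } k \le n/16 \text{ and } n \ge 2 .
\]
```

The last step uses $n \le 2(n-1)$, which holds exactly when $n \ge 2$.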

The second case is that $S \cap E = \emptyset$. As before, choose an arbitrary spanning tree $T$ in $G$.
Let $B_T$ be the event that $T$ is in $G_D$. Since $B_T \subseteq B$, we have $\Pr[A \text{ and } B] \ge \Pr[A \text{ and } B_T] =
\Pr[A \mid B_T] \Pr[B_T] \ge \Pr[A \mid B_T]/2$, where the inequality follows from the argument of the first case.
Since $S$ and $T$ are disjoint, the probability of selecting at least one edge from $S$ from among the
potential $\binom{n}{2} - |T|$ edges not including $T$ is higher than the probability of selecting at least one edge
from $S$ from among all potential $\binom{n}{2}$ edges. We thus have $\Pr[A \text{ and } B] \ge \Pr[A \mid B_T]/2 \ge \Pr[A]/2$.

To complete the second case, we next lower-bound $\Pr[A]$. Consider the process of sampling $G_D$
by toggling up to $k$ edges. The number of possible graphs to sample from is thus $\sum_{i=0}^{k} \binom{\binom{n}{2}}{i}$.
For $k \le \binom{n}{2}/2$ (which follows from $k \le n/2$), the terms in the summation are strictly increasing. So
$\sum_{i=\lceil k/2 \rceil}^{k} \binom{\binom{n}{2}}{i} \ge (1/2) \sum_{i=0}^{k} \binom{\binom{n}{2}}{i}$, i.e., with probability at least $1/2$ we toggle at least $k/2$
edges. If we choose (at least) $k/2$ random edges, the probability that none of them is in $S$ is at most
$\left(1 - |S|/\binom{n}{2}\right)^{k/2} \le 1 - (k/4)|S|/\binom{n}{2}$, following from the lemma assumption that the right-hand side is at
least $1/2$.$^2$ Hence, when choosing at least $k/2$ edges, the probability of selecting at least one edge
from $S$ is at least $(k/4)|S|/\binom{n}{2}$. We conclude that $\Pr[A] \ge (1/2)(k/4)|S|/\binom{n}{2} \ge (1/8) k|S|/n^2$. □

$^2$The inequality can easily be proven by induction over the exponent: assuming the product so far satisfies
$p \ge 1/2$, we have $p(1 - x) \le p - x/2$.

Lemma 4.2. There exists constant $c_2 > 0$ such that the following holds. Consider any graph
$G = (V, E) \in \mathit{allowed}(\mathcal{G}_{conn})$. Consider also any nonempty set $S \subseteq E$ of edges in the graph and
smoothing value $k \le n/16$. Then with probability at most $c_2 k|S|/n^2$, the $k$-smoothed graph $G'$
removes an edge from $S$.

Proof. As in the proof of Lemma 4.1, consider a graph $G_D$ sampled from $\mathit{editdist}(G, k)$, and let $B$
be the event that $G_D$ remains connected. Let $A_R$ be the event that an edge from $S$ is deleted. We
want to upper-bound $\Pr[A_R \mid B] = \Pr[A_R \text{ and } B]/\Pr[B] \le \Pr[A_R]/\Pr[B]$. As long as $\Pr[B] \ge 1/2$,
we have $\Pr[A_R \mid B] \le 2\Pr[A_R]$. (Proving that $\Pr[B] \ge 1/2$ is virtually identical to the second case
in the proof of Lemma 4.1.)

We now upper-bound $\Pr[A_R]$. For $A_R$ to occur, an edge from $S$ must be toggled. So we can
upper-bound $\Pr[A_R]$ using a union bound over the at most $k$ edge selections. In particular, the $i$th
edge selected belongs to $S$ with probability at most $2|S|/\binom{n}{2}$ (by the same argument as case 2 of
Lemma 4.1), giving $\Pr[A_R] \le 2k|S|/\binom{n}{2}$. □

Our next lemma is applicable when lower-bounding an algorithm’s performance on a dynamic 
graph. It says essentially that Lemma 4.1 is tight—it is not too likely to add any of the helpful 
edges from S. 

Lemma 4.3. There exists constant $c_3 > 0$ such that the following holds. Consider any graph
$G = (V, E) \in \mathit{allowed}(\mathcal{G}_{conn})$. Consider also any set $S$ of edges and smoothing value $k \le n/16$ such
that $S \cap E = \emptyset$. Then with probability at most $c_3 k|S|/n^2$, the $k$-smoothed graph $G'$ of $G$ contains
an edge from $S$.

Proof. This proof is identical to the proof of Lemma 4.2, with the event $A_R$ replaced by the event
$A$ that at least one edge from $S$ is added. (In either case, the event is that at least one edge from
$S$ is toggled by $G_D$. Here, it is important that $S \cap E = \emptyset$ so that toggling is required to yield the
edge in $G'$.) □

4.2 Pairing Network 

The second type we study is the pairing network type [4]. This type is defined as $(\mathcal{G}_{pair}, \mathcal{C}_{pair})$,
where $\mathcal{G}_{pair}$ contains every dynamic graph (defined with respect to our fixed node set $V$) in which
every individual graph is a (not necessarily complete) matching, and $\mathcal{C}_{pair}$ reliably communicates
messages between pairs of nodes connected in the given round. This network type is motivated 
by the current peer-to-peer network technologies implemented in smart devices. These low-level 
protocols usually depend on discovering nearby nodes and initiating one-on-one local interaction. 
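
As a small illustration of the constraint this type places on each round's graph, the following Python sketch checks whether an edge set is a (not necessarily complete) matching and looks up a node's partner. The helper names `is_matching` and `partner` and the frozenset edge encoding are assumptions made for this example only.

```python
def is_matching(edges):
    """True if no node appears in more than one edge."""
    seen = set()
    for e in edges:
        if len(e) != 2 or seen & set(e):
            return False
        seen |= set(e)
    return True

def partner(edges, v):
    """Return the node matched to v in this round, or None if v is unmatched."""
    for e in edges:
        if v in e:
            (w,) = set(e) - {v}
            return w
    return None
```

A predicate such as `is_matching` is what plays the role of the "allowed" test when smoothing is carried out with respect to the pairing type.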

Properties of Smoothed Pairing Networks. In the following, when discussing a matching 
G, we partition nodes into one of two types: a node is matched if it is connected to another node 
by an edge in G, and it is otherwise unmatched. The following property concerns the probability 
that smoothing affects (i.e., adds or deletes at least one adjacent edge) a given node u from a set 
S of nodes of the same type. It notes that as the set S containing u grows, the upper bound on 
the probability that u is affected decreases. The key insight behind this not necessarily intuitive 
statement is that this probability must be the same for all nodes in S (due to their symmetry in 
the graph). Therefore, a given probability will generate more expected changes as S grows, and 
therefore, to keep the expected changes below the k threshold, this bound on this probability must 
decrease as S grows.

Lemma 4.4. Consider any graph $G = (V, E) \in \mathit{allowed}(\mathcal{G}_{pair})$ and constant $\delta > 1$. Let $S \subseteq V$ be
a set of nodes in $G$ such that: (1) all nodes in $S$ are of the same type (matched or unmatched), and
(2) $|S| \ge n/\delta$. Consider any node $u \in S$ and smoothing factor $k < n/(2\delta)$. Let $G'$ be the result
of $k$-smoothing $G$. The probability that $u$'s adjacency list is different in $G'$ as compared to $G$ is no
more than $(2 \delta k)/n$.

Proof. For a given $w \in S$, let $p_w$ be the probability that $w$'s adjacency list is different in $G'$ as
compared to $G$. Assume for contradiction that $p_u > (2 \delta k)/n$. We first note that all nodes in $S$
are of the same type and therefore their probability of being changed must be symmetric. Formally,
for every $u, v \in S$: $p_u = p_v$.

For each $w \in S$, define $X_w$ to be the indicator random variable that is 1 if $w$'s adjacency list
changes in a particular selection of $G'$, and otherwise evaluates to 0. Let $Y = \sum_{w \in S} X_w$ be the
total number of nodes in $S$ that end up changed in $G'$. Leveraging the symmetry identified above,
we can lower bound this expectation:

$E(Y) = E\left(\sum_{w \in S} X_w\right) = \sum_{w \in S} E(X_w) = \sum_{w \in S} p_w = \sum_{w \in S} p_u = |S| \cdot p_u > \frac{n}{\delta} \cdot \frac{2 \delta k}{n} = 2k.$

Let $Z$ be the constant random variable that always evaluates to $2k$. We can interpret $Z$ as an
upper bound on the total number of nodes that are affected by changes occurring in $G'$. (By the
definition of $k$-smoothing there are at most $k$ edge changes, and each change affects two nodes.)
Because $Y$ counts the nodes affected by changes in $G'$, it follows that in all trials $Y \le Z$. The
monotonicity property of expectation therefore provides that $E(Y) \le E(Z) = 2k$. We established
above, however, that under our assumed $p_u$ bound, $E(Y) > 2k$. This contradicts our assumption
that $p_u > (2 \delta k)/n$. □

5 Flooding 

Here we consider the performance of a basic flooding process in a connected dynamic network. In 
more detail, we assume a single source node starts with a message. In every round, every node that 
knows the message broadcasts the message to its neighbors. (Flooding can be trivially implemented 
in a connected network type due to reliable communication.) We consider the flooding process 
complete in the first round that every node has the message. Without smoothing, this problem 
clearly takes $\Omega(n)$ rounds in large diameter static graphs, so a natural alternative is to state bounds
in terms of diameter. Unfortunately, there exist dynamic graphs (e.g., the spooling graph defined
below) where the graph in each round is constant diameter, but flooding still requires $\Omega(n)$ rounds.

We show that this $\Omega(n)$ lower bound is somewhat fragile by proving a polynomial improvement
with any smoothing. Specifically, we show an upper bound of $O(n^{2/3} \log(n)/k^{1/3})$ rounds, with
high probability, with $k$-smoothing. We also exhibit a nearly matching lower bound by showing
that the dynamic spooling graph requires $\Omega(n^{2/3}/k^{1/3})$ rounds with constant probability.

5.1 Lower Bound 

We build our lower bound around the dynamic spooling graph, defined as follows. Label the nodes 
from 1 to $n$, where node 1 is the source. The spooling graph is a dynamic graph where in each
round $r$, the network is the $\min\{r, n-1\}$-spool graph. We define the $i$-spool graph, for $i \in [n-1]$,
to be the graph consisting of: a star on nodes $\{1, \ldots, i\}$ centered at $i$ called the left spool, a star on
nodes $\{i+1, \ldots, n\}$ centered on $i+1$ called the right spool, and an edge between the two centers $i$
and $i+1$. We call $i+1$ the head node.

With node 1 as the source node, it is straightforward to see that, in the absence of smoothing,
flooding requires n — 1 rounds to complete on the spooling network. (Every node in the left spool 
has the message but every node in the right spool does not. In each round, the head node receives 
the message then moves to the left spool.) We generalize this lower bound to smoothing. The main 
idea is that in order for every node to receive the message early, one of the early heads must be 
adjacent to a smoothed edge. 
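
The unsmoothed behavior is easy to reproduce in simulation. The following Python sketch builds the $i$-spool graph and runs flooding on the spooling graph without any smoothing; the function names and the frozenset edge encoding are illustrative assumptions rather than part of the construction itself.

```python
def spool_edges(i, n):
    """Edge set of the i-spool graph on nodes 1..n, for 1 <= i <= n - 1."""
    edges = {frozenset((i, v)) for v in range(1, i)}               # left spool, centered at i
    edges |= {frozenset((i + 1, v)) for v in range(i + 2, n + 1)}  # right spool, centered at i + 1
    edges.add(frozenset((i, i + 1)))                               # edge between the two centers
    return edges

def flood_rounds(n, graph_at_round, source=1):
    """Rounds until all n nodes hold the message, given a connected graph per round."""
    informed = {source}
    r = 0
    while len(informed) < n:
        r += 1
        newly = set()
        for e in graph_at_round(r):
            u, v = tuple(e)
            # A node that held the message at the start of the round informs its neighbors.
            if u in informed or v in informed:
                newly.update((u, v))
        informed |= newly
    return r

def spooling_flood_rounds(n):
    """Flooding time on the (unsmoothed) spooling graph with source node 1."""
    return flood_rounds(n, lambda r: spool_edges(min(r, n - 1), n))
```

In each round only the current head node is adjacent to an informed node, so exactly one new node learns the message per round and the simulation returns $n - 1$.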

Theorem 5.1. Consider the flooding process on a $k$-smoothed $n$-vertex spooling graph, with $k \le \sqrt{n}$
and sufficiently large $n$. With probability at least $1/2$, the flooding process does not complete before
the $\Omega(n^{2/3}/k^{1/3})$-th round.

Proof. Consider the first $R - 1 = \delta n^{2/3}/k^{1/3}$ rounds of flooding on the spooling graph, where $\delta \le 1$
is a constant to be tuned later. Let $C$ be the set of nodes with ids from 1 to $R$. These are the
nodes that will play the role of head node during one of the R — 1 rounds. Our goal is to argue 
that, with constant probability, there is at least one node that does not receive the message during 
these rounds. 

There are three ways a node can learn the message during these R — 1 rounds: (1) it is the 
head and learns the message from the center of the left spool; (2) it is in the right spool and there 
is a smoothed edge between this node and a node with the message; (3) it is in the right spool and 
the current head node already has the message. Case (1) is the easiest—the nodes C receive the 
message this way. We next bound the other two cases. We say that we fail case (2) if more than 
R nodes receive the message by case (2). We say that we fail case (3) if case (3) ever occurs in the 
first R rounds. We shall show that either case fails with probability at most 1/4, and hence the 
probability of more than 2R nodes receiving the message is at most 1/2. 

We first bound the probability of case (3) occurring. This case can only happen if one of the 
nodes in C received the message early due to a smoothed edge. Consider each round r during 
which we have not yet failed cases (2) or (3): i.e., at most 2R nodes in total have the message, 
but none of the nodes with ids between $r + 1$ and $R$ have it. From Lemma 4.3, the probability of
adding an edge between a node with the message and one of the relevant heads (those nodes in $C$)
is at most $c_3 k(2R)|C|/n^2 = O(kR^2/n^2) = O(\delta^2 k^{1/3}/n^{2/3})$. Taking a union bound over all $R$ rounds, we
get that the probability of failing case (3) is at most $O(\delta^2 R k^{1/3}/n^{2/3}) = O(\delta^3)$. For small enough
$\delta$, this is at most $1/4$.

We next argue that for small enough constant $\delta$, we fail case (2) with probability at most $1/4$.
To prove the claim, consider a round $r \le R$ and suppose that we have not yet failed cases (2) or
(3). Thus at most $2R$ nodes have the message, and the probability that a specific node receives
the message by case (2) in round $r$ is at most $c_3 k(2R)/n^2 = O(kR/n^2)$ by Lemma 4.3. Thus, by
linearity of expectation, the expected number of nodes receiving the message in round $r$ is $O(kR/n)$.
Summing over all $R - 1$ rounds, the expected total number of nodes that learn the message this way
is $O(kR^2/n) = O(k(\delta n^{2/3}/k^{1/3})^2/n) = O(\delta^2 n^{1/3} k^{1/3}) = O(\delta R k^{2/3}/n^{1/3}) = O(\delta R)$ for $k = O(\sqrt{n})$.
Thus, for $\delta$ small enough, the expected total number of nodes that receive the message by case (2)
is at most $R/4$. Applying Markov's inequality, the probability that more than $R$ nodes receive the message
is at most $1/4$. We thus conclude that we fail case (2) with probability at most $1/4$. □
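
The choice of $R$ can be read off from the case (3) union bound. As a rough sketch, writing $c$ for a hypothetical absolute constant that absorbs $c_3$ and the factor of 2, and ignoring the additive 1 in $R$:

```latex
\[
\underbrace{R}_{\text{rounds}} \cdot \underbrace{\frac{c\,k\,R^{2}}{n^{2}}}_{\text{per-round bound}}
  \;=\; \frac{c\,k\,R^{3}}{n^{2}} \;\le\; \frac{1}{4}
  \quad\Longleftrightarrow\quad
  R \;\le\; \left(\frac{1}{4c}\right)^{1/3} \frac{n^{2/3}}{k^{1/3}} .
\]
```

Under these assumptions, any $R = \delta n^{2/3}/k^{1/3}$ with $\delta \le (4c)^{-1/3}$ keeps the failure probability of case (3) at most $1/4$, which is where the $n^{2/3}/k^{1/3}$ scaling of the lower bound comes from.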

5.2 An Upper Bound for General Networks 

We now show that flooding in every $k$-smoothed network will complete in $O(n^{2/3} \log n/k^{1/3})$ time,
with high probability. When combined with the $\Omega(n^{2/3}/k^{1/3})$ lower bound from above, this shows
this analysis to be essentially tight for this problem under smoothing.


Support Sequences. The core idea is to show that every node in every network is supported 
by a structure in the dynamic graph such that if the message can be delivered to anywhere in 
this structure in time, it will subsequently propagate to the target. In the spooling network, this 
structure for a given target node u consists simply of the nodes that will become the head in the 
rounds leading up to the relevant complexity deadline. The support sequence object defined below 
generalizes a similar notion to all graphs. It provides, in some sense, a fat target for the smoothed 
edges to hit in their quest to accelerate flooding. 

Definition 5.2. Fix two integers $t$ and $\ell$, $1 \le \ell < t$, a dynamic graph
$\mathcal{H} = G_1, G_2, \ldots$ with $G_i = (V, E_i)$ for all $i$, and a node $u \in V$. A
$(t, \ell)$-support sequence for $u$ in $\mathcal{H}$ is a sequence $S_0, S_1, S_2, \ldots, S_\ell$
such that the following properties hold: (1) For every $i \in [0, \ell]$: $S_i \subseteq V$. (2)
$S_0 = \{u\}$. (3) For every $i \in [1, \ell]$: $S_{i-1} \subset S_i$ and
$S_i \setminus S_{i-1} = \{v\}$, for some $v \in V$ such that $v$ is adjacent to at least one node
of $S_{i-1}$ in $G_{t-i}$.

Notice that the support structure is defined "backwards," with $S_0$ containing the target node $u$,
and each subsequent step going one round back in time. We prove that every connected dynamic
graph has such a support structure, because the graph is connected in every round.

Lemma 5.3. Fix some dynamic graph $\mathcal{H} \in \mathcal{G}_{conn}$ on vertex set $V$, some node
$u \in V$, and some rounds $t$ and $\ell$, where $1 \le \ell < t$. There exists a
$(t, \ell)$-support sequence for $u$ in $\mathcal{H}$.

Proof. We can iteratively construct the desired support sequence. The key observation behind this
procedure is the following: given any $S_i \subset V$ and static connected graph $G$ over $V$,
there exists some $v \in V \setminus S_i$ that is connected to at least one node in $S_i$. This
follows because the absence of such an edge would imply that the subgraph induced by $S_i$ is a
disconnected component in $G$, contradicting the assumption that $G$ is connected. With this
property established, constructing the $(t, \ell)$-support sequence is straightforward: start with
$S_0 = \{u\}$ and apply the above procedure again and again to the graphs at times
$t - 1, t - 2, \ldots, t - \ell$ to define $S_1, S_2, \ldots, S_\ell$ as required by the
definition of a support sequence. □
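The iterative construction in this proof can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `support_sequence` and the representation of a dynamic graph as a dict from round numbers to adjacency dicts are hypothetical choices, not notation from the paper, and the tie-breaking rule (smallest eligible node) is arbitrary.

```python
def support_sequence(graphs, u, t, ell):
    """Greedily build S_0, ..., S_ell for target u, walking backwards in time.

    graphs: dict mapping a round number r to that round's adjacency dict
            {node: set_of_neighbours}; every round is assumed to be connected.
    """
    assert 1 <= ell < t
    seq = [{u}]                      # S_0 = {u}
    for i in range(1, ell + 1):
        prev = seq[-1]               # S_{i-1}
        adj = graphs[t - i]          # the graph of round t - i
        # Connectivity guarantees that some node outside S_{i-1} is adjacent
        # to it, as long as S_{i-1} is not already all of V.
        v = next(w for w in sorted(adj) if w not in prev and adj[w] & prev)
        seq.append(prev | {v})       # S_i = S_{i-1} plus the new node v
    return seq


# Usage: a static 4-node path 0-1-2-3 repeated for rounds 1..5.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
graphs = {r: path for r in range(1, 6)}
print(support_sequence(graphs, u=0, t=5, ell=3))
```

On the path example each step can only add the next node along the path, so the sequence grows from `{0}` to the whole vertex set.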

The following key lemma shows that over every period of $O(n^{2/3}/k^{1/3})$ rounds of $k$-smoothed
flooding, every node has a constant probability of receiving the message. Applying this lemma over
$O(\log n)$ consecutive time intervals with a Chernoff bound, we get our main theorem.

Lemma 5.4. There exists a constant $\alpha \ge 3$ such that the following holds. Fix a dynamic graph
$\mathcal{H} \in \mathcal{G}_{conn}$ on vertex set $V$, any node $u \in V$, and a consecutive
interval of $\alpha n^{2/3}/k^{1/3}$ rounds. For smoothing value $k \le n/16$, node $u$ receives the
flooded message in the $k$-smoothed version of $\mathcal{H}$ with probability at least 1/2.

Proof. Let $t = \alpha n^{2/3}/k^{1/3}$, and let $\ell = n^{2/3}/k^{1/3}$. Let
$S = S_0, S_1, \ldots, S_\ell$ be a $(t, \ell)$-support sequence for $u$ in $\mathcal{H}$. (By
Lemma 5.3, we know such a structure exists.) The key claim in this proof is the following:

(*) If a node in $S_\ell$ receives the broadcast message by round $t - \ell$, then $u$ receives the
broadcast message by round $t$ with probability at least 3/4.

To establish this claim we first introduce some further notation. Let $v_i$, for $i \in [\ell]$, be
the single node in $S_i \setminus S_{i-1}$, and let $v_0 = u$.

We will show by (reverse) induction the following invariant: for every $i \in [0, \ell]$, by the
beginning of round $t - i$ there is some node $v_j$ with $j \le i$ that has received the broadcast
message. For $i = 0$, this implies that node $u = v_0$ has received the message and we are done.
The base case, where $i = \ell$, follows by the assumption that some node in $S_\ell$ receives the
broadcast message prior to round $t - \ell$.

Assume that the invariant holds for $i + 1$; we will determine the probability that it holds for
$i$. That is, by the beginning of round $t - (i + 1)$, there is some node $v_j$ for $j \le i + 1$
that has received the broadcast message. If $j \le i$, then the invariant is already true for $i$.
Assume, then, that $j = i + 1$, i.e., $v_{i+1}$ has received the message by the beginning of round
$t - i - 1$. By the definition of the $(t, \ell)$-support sequence, node $v_{i+1}$ is connected by
an edge $e$ to some node in $S_i$, i.e., to some node $v_{j'}$ for $j' \le i$. Thus if the specified
edge $e$ is not removed by smoothing, then by the beginning of round $t - i$, there is some node
$v_{j'}$ for $j' \le i$ that has received the message.

The probability that edge $e$ is removed by smoothing is at most $c_2 k/n^2$ (by Lemma 4.2), for
some constant $c_2$. By taking a union bound over the $\ell = n^{2/3}/k^{1/3}$ steps of the
induction argument, we see that the claim fails with probability at most
$\ell \cdot c_2 k/n^2 = c_2 k^{2/3}/n^{4/3} < 1/4$ for $k \le n$ and sufficiently large $n$. This
proves the claim.

Now that we have established the value of getting a message to $S_\ell$ by round $t - \ell$, we are
left to calculate the probability that this event occurs. To do so, we first note that after
$n^{2/3}/k^{1/3}$ rounds, there exists a set $T$ containing at least $n^{2/3}/k^{1/3}$ nodes that
have the message. This follows because at least one new node must receive the message after every
round (due to the assumption that $\mathcal{H} \in \mathcal{G}_{conn}$).

There are therefore $|T||S_\ell|$ possible edges between $T$ and $S_\ell$, and if any of those edges
is added via smoothing after round $n^{2/3}/k^{1/3}$ and before round
$(\alpha - 1) n^{2/3}/k^{1/3}$, then the target $u$ has a good chance of receiving the message. In
each such round, by Lemma 4.1, such an edge addition occurs with probability at least
$c_1 k |T||S_\ell|/n^2 \ge c_1 k^{1/3}/n^{2/3}$.

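Thus the probability that the message does not get to a node in $S_\ell$ during any of these
$(\alpha - 2) n^{2/3}/k^{1/3}$ rounds is at most
$(1 - c_1 k^{1/3}/n^{2/3})^{(\alpha - 2) n^{2/3}/k^{1/3}} \le e^{-c_1(\alpha - 2)}$. For a proper
choice of $\alpha$, this shows that such an edge is added in at least one round between time
$n^{2/3}/k^{1/3}$ and time $(\alpha - 1) n^{2/3}/k^{1/3}$ with probability at least 3/4, and thus by
time $t - \ell$ at least one node in $S_\ell$ has received the message.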

Putting the pieces together, we look at the probability of both events happening: the message
reaches $S_\ell$ by round $t - \ell$, and the message propagates successfully through the support
structure. Summing the probabilities of error, we see that node $u$ receives the message by time
$t$ with probability at least 1/2. □

Theorem 5.5. For any dynamic graph $\mathcal{H} \in \mathcal{G}_{conn}$ and smoothing value
$k \le n/16$, flooding completes in $O(n^{2/3} \log n / k^{1/3})$ rounds on the $k$-smoothed version
of $\mathcal{H}$ with high probability.

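Proof. Fix a non-source node $u$. We know via Lemma 5.4 that in each time interval of length
$O(n^{2/3}/k^{1/3})$, node $u$ receives the message with probability at least 1/2. Thus, over
$O(\log n)$ such intervals, $u$ fails to receive the message with probability at most
$(1/2)^{O(\log n)}$, which is polynomially small in $n$ for a suitable constant; that is, $u$
receives the message with high probability. A union bound over the $n - 1$ non-source nodes yields
the final result. □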
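As a rough numerical illustration of this last step, the following Python sketch computes how many intervals suffice for a target failure probability of $n^{-c}$. The helper names and the exponent parameter `c` are hypothetical, and the calculation assumes that the per-interval guarantee of Lemma 5.4 can be applied afresh to each interval.

```python
import math


def intervals_needed(n, c=1):
    """Smallest m of the form ceil((c + 1) * log2 n), which guarantees
    (n - 1) * 2**(-m) <= n**(-c)."""
    return math.ceil((c + 1) * math.log2(n))


def flooding_failure_bound(n, m):
    """Union bound over the n - 1 non-source nodes, each of which misses the
    message in all m intervals with probability at most 2**(-m)."""
    return (n - 1) * 0.5 ** m


n = 1024
m = intervals_needed(n)
print(m, flooding_failure_bound(n, m))
```

For $n = 1024$ and $c = 1$ this asks for 20 intervals, after which the union bound is below $1/n$.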

6 Random Walks 

As discussed in Section 1, random walks in dynamic graphs exhibit fundamentally different behavior
from random walks in static graphs. Most notably, in dynamic graphs there can be pairs of nodes
whose hitting time is exponential [2], even though in static (connected) graphs it is well known
that the maximum hitting time is at most $O(n^3)$ [12]. This is true even under obvious technical
restrictions necessary to prevent infinite hitting times, such as requiring the graph to be connected
at all times and to have self-loops at all nodes.
We show that this lower bound is extremely fragile. A very simple argument shows that a small
perturbation (1-smoothing) is enough to guarantee that in any dynamic graph, all hitting times are
at most $O(n^3)$. Larger perturbations ($k$-smoothing) lead to $O(n^3/k)$ hitting times. We also
prove a lower bound of $\Omega(n^{5/2}/(\sqrt{k} \ln n))$ using an example which is in fact a static
graph (made dynamic by simply repeating it).

6.1 Preliminaries 

We begin with some technical preliminaries. In a static graph $G$, a random walk starting at
$u \in V$ is a walk on $G$ where the next node is chosen uniformly at random from the set of
neighbors of the current node (possibly including the current node itself if there is a self-loop).
The hitting time $H(u, v)$ for $u, v \in V$ is the expected number of steps taken by a random walk
starting at $u$ until it hits $v$ for the first time. Random walks are defined similarly in a
dynamic graph $\mathcal{H} = G_1, G_2, \ldots$: at first the random walk starts at $u$, and if at
the beginning of time step $t$ it is at a node $v_t$ then in step $t$ it moves to a neighbor of
$v_t$ in $G_t$ chosen uniformly at random. Hitting times are defined in the same way as in the
static case.
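To make the definition concrete, here is a minimal Python sketch of a random walk on a dynamic graph, with no smoothing applied. The names `hitting_steps` and `estimate_hitting_time`, and the convention that a hypothetical callback `graph_at(t)` returns the adjacency dict of $G_t$, are assumptions made only for this illustration.

```python
import random


def hitting_steps(graph_at, u, v, rng, max_steps=10**6):
    """Walk from u until v is reached; return the number of steps taken,
    or None if v is not reached within max_steps."""
    pos, steps = u, 0
    while pos != v and steps < max_steps:
        steps += 1
        # In step t the walk moves to a uniformly random neighbour in G_t.
        neighbours = sorted(graph_at(steps)[pos])
        pos = rng.choice(neighbours)
    return steps if pos == v else None


def estimate_hitting_time(graph_at, u, v, trials=1000, seed=0):
    """Monte Carlo estimate of H(u, v), averaging over walks that finished."""
    rng = random.Random(seed)
    samples = [hitting_steps(graph_at, u, v, rng) for _ in range(trials)]
    finished = [s for s in samples if s is not None]
    return sum(finished) / len(finished)


# Usage: a static triangle with self-loops, repeated in every round.
triangle = {w: {0, 1, 2} for w in range(3)}
print(estimate_hitting_time(lambda t: triangle, u=0, v=2))
```

In the triangle example every step reaches the target with probability 1/3, so the true hitting time is 3 and the estimate should land close to it.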

The definition of the hitting time in a smoothed dynamic graph is intuitive but slightly subtle.
Given a dynamic graph $\mathcal{H}$ and vertices $u, v$, the hitting time from $u$ to $v$ under
$k$-smoothing, denoted by $H_k(u, v)$, is the expected number of steps taken by a random walk
starting at $u$ until first reaching $v$ in the (random) $k$-smoothed version $\mathcal{H}'$ of
$\mathcal{H}$ (either with respect to $\mathcal{G}_{conn}$ or with respect to the set
$\mathcal{G}_{all}$ of all dynamic graphs). Note that this expectation is now taken over two
independent sources of randomness: the randomness of the random walk, and also the randomness
of the smoothing (as defined in Section 3).

6.2 Upper Bounds 

We first prove that even a tiny amount of smoothing is sufficient to guarantee polynomial
hitting times, even though without smoothing there is an exponential lower bound. Intuitively, this
is because if we add a random edge at every time point, there is always some inverse polynomial
probability of directly jumping to the target node. We also show that more smoothing decreases
this bound linearly.

Theorem 6.1. In any dynamic graph $\mathcal{H}$, for all vertices $u, v$ and value $k \le n/16$, the
hitting time $H_k(u, v)$ under $k$-smoothing (with respect to $\mathcal{G}_{all}$) is at most
$O(n^3/k)$. This is also true for smoothing with respect to $\mathcal{G}_{conn}$ if
$\mathcal{H} \in \mathcal{G}_{conn}$.

Proof. Consider some time step $t$, and suppose that the random walk is at some node $w$. If
$\{w, v\}$ is an edge in $G_t$ (the graph at time $t$), then the probability that it remains an
edge under $k$-smoothing is at least $\Omega(1)$ (this is direct for smoothing with respect to
$\mathcal{G}_{all}$, or from Lemma 4.1 for smoothing with respect to $\mathcal{G}_{conn}$). If
$\{w, v\}$ is not an edge in $G_t$, then the probability that $\{w, v\}$ exists due to smoothing is
at least $\Omega(k/n^2)$ (again, either directly or from Lemma 4.3). In either case, if this edge
does exist, the probability that the random walk takes it is at least $1/n$. So the probability
that the next node in the walk is $v$ is at least $\Omega(k/n^3)$. Thus at every time step the
probability that the next node in the walk is $v$ is $\Omega(k/n^3)$, so the expected time until
the walk hits $v$ is $O(n^3/k)$. □

A particularly interesting example is the dynamic star, which was used by Avin et al. [2] to
prove an exponential lower bound. The dynamic star consists of $n$ vertices
$\{0, 1, \ldots, n - 1\}$, where the center of the star at time $t$ is $t \bmod (n - 1)$ (note that
node $n - 1$ is never the center). Every node also has a self-loop. Avin et al. [2] proved that the
hitting time from node $n - 2$ to node $n - 1$ is at least $2^{n-2}$. It turns out that this lower
bound is particularly fragile: not only does Theorem 6.1 imply that the hitting time is polynomial,
it is actually a factor of $n$ better than the global upper bound due to the small degrees at the
leaves.
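The dynamic star is easy to write down explicitly. The sketch below is an illustration only: the function name `dynamic_star` and the adjacency-dict representation are assumptions, and no smoothing is applied.

```python
def dynamic_star(n, t):
    """Adjacency dict of the dynamic star on nodes 0..n-1 at time t.

    The centre at time t is t mod (n - 1), so node n - 1 is never the centre.
    Every node carries a self-loop.
    """
    centre = t % (n - 1)
    adj = {w: {w} for w in range(n)}      # self-loops
    for w in range(n):
        if w != centre:
            adj[centre].add(w)            # centre is adjacent to every leaf
            adj[w].add(centre)            # each leaf sees only the centre
    return adj


# Usage: with n = 5 the centre cycles through 1, 2, 3, 0, 1, ...
for t in range(1, 6):
    print(t, t % 4, sorted(dynamic_star(5, t)[4]))
```

A leaf has degree 2 (its self-loop plus the centre), which is the "small degree at the leaves" that the improved bound exploits.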

Theorem 6.2. $H_k(u, v)$ is at most $O(n^2/k)$ in the dynamic star for all $k \le n/16$ and for all
vertices $u, v$ (where smoothing is with respect to $\mathcal{G}_{conn}$).

Proof. Suppose that the random walk is at some node $w$ at time $t$. If $w$ is the center of the
star at time $t$, then with constant probability $\{w, v\}$ is not removed by smoothing and with
probability $\Omega(1/n)$ the random walk follows the edge $\{w, v\}$. If $w$ is a leaf, then by
Lemma 4.1, the probability that the edge $\{w, v\}$ exists in the smoothed graph is at least
$\Omega(k/n^2)$. On the other hand, it is straightforward to see that with constant probability no
other edge incident on $w$ is added (this is direct from Lemma 4.3 if $k = o(n)$, but for the
dynamic star continues to hold up to $k = n$). If this happens, then the degree of $w$ is $O(1)$
and thus the probability that the walk follows the randomly added edge $\{w, v\}$ is
$\Omega(k/n^2)$. Thus in every time step the probability that the walk moves to $v$ is
$\Omega(k/n^2)$, and thus the hitting time is at most $O(n^2/k)$. □

6.3 Lower Bounds 

Since the dynamic star was the worst example for random walks in dynamic graphs without smoothing,
Theorem 6.2 naturally leads to the question of whether the bound of $O(n^2/k)$ holds for all
dynamic graphs in $\mathcal{G}_{conn}$, or whether the weaker bound of $O(n^3/k)$ from Theorem 6.1
is tight.

We show that under smoothing, the dynamic star is in fact not the worst case: a lower bound of
$\Omega(n^{5/2}/(\sqrt{k} \ln n))$ holds for the lollipop graph. The lollipop is a famous example
of a graph in which the hitting time is large: there are nodes $u$ and $v$ such that
$H(u, v) = \Theta(n^3)$ (see, e.g., [12]). Here we will use it to prove a lower bound on the
hitting time of dynamic graphs under smoothing:

Theorem 6.3. There is a dynamic graph $\mathcal{H} \in \mathcal{G}_{conn}$ and nodes $u, v$ such
that $H_k(u, v) \ge \Omega(n^{5/2}/(\sqrt{k} \ln n))$ for all $k \le n/16$ (where smoothing is with
respect to $\mathcal{G}_{conn}$).

Proof. In the lollipop graph $L_n = (V, E)$, the vertex set is partitioned into two pieces $V_1$
and $V_2$ with $|V_1| = |V_2| = n/2$. The nodes in $V_1$ form a clique (i.e., there is an edge
between every two nodes in $V_1$), while the nodes in $V_2$ form a path (i.e., there is a bijection
$\pi : [n/2] \to V_2$ such that there is an edge between $\pi(i)$ and $\pi(i + 1)$ for all
$i \in [(n/2) - 1]$). There is also a single special node $v^* \in V_1$ which has an edge to the
beginning of the $V_2$ path, i.e., there is also an edge $\{v^*, \pi(1)\}$.
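The lollipop construction can be written out directly. The following sketch is illustrative: the function name `lollipop`, the integer node labels, and the choice of node 0 as $v^*$ are assumptions, not part of the paper.

```python
def lollipop(n):
    """Build L_n as an adjacency dict.

    Nodes 0..n/2-1 form the clique V_1, nodes n/2..n-1 form the path V_2
    with pi(i) = n/2 + i - 1, and v* = 0 is joined to pi(1) = n/2.
    Returns (adjacency, v_star, pi(1), pi(n/2)).
    """
    assert n >= 4 and n % 2 == 0
    half = n // 2
    adj = {w: set() for w in range(n)}

    def add_edge(x, y):
        adj[x].add(y)
        adj[y].add(x)

    for i in range(half):                 # clique on V_1
        for j in range(i + 1, half):
            add_edge(i, j)
    for i in range(half, n - 1):          # path on V_2
        add_edge(i, i + 1)
    v_star = 0
    add_edge(v_star, half)                # the single edge {v*, pi(1)}
    return adj, v_star, half, n - 1


adj, v_star, first, last = lollipop(8)
print(len(adj[v_star]), len(adj[first]), len(adj[last]))
```

The last path node has degree 1 and every other path node has degree 2, which is the property the phase argument below relies on.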

The dynamic graph $\mathcal{H}$ we will use is the dynamic lollipop: $G_i = L_n$ for all $i \ge 1$.
Let $u$ be an arbitrary node in $V_1$, and let $v = \pi(n/2)$ be the last node on the path. We
claim that $H_k(u, v) \ge \Omega(n^{5/2}/(\sqrt{k} \ln n))$.

We will first define the notion of a phase. In an arbitrary walk on $\mathcal{H}$ under
$k$-smoothing, a phase is a maximal time interval in which every node hit by the walk is in $V_2$
and all edges traversed are in $L_n$ (i.e., none of the randomly added smoothed edges are
traversed). The starting point of a phase is the first vertex contained in the phase.

Let $F = \{w \in V_2 : \pi^{-1}(w) \ge (n/2 - c\sqrt{n/k} \ln n)\}$ for some constant $c$ that we
will determine later. In other words, $F$ is the final interval of the path, of length
$|F| = c\sqrt{n/k} \ln n$. We divide into two cases, depending on whether $v = \pi(n/2)$ is first
hit in a phase that starts in $F$. We prove that in either case, the (conditional) hitting time is
at least $\Omega(n^{5/2}/(\sqrt{k} \ln n))$. Clearly this implies the theorem.
Case 1: Suppose that $v$ is first hit in a phase with starting point outside of $F$. Consider such a
phase (one with starting point outside of $F$). Then by Lemma 4.1 and the fact that the degree of
every node in $V_2$ is at most 2, the probability at each time step that the phase ends due to
following a smoothed edge is at least $ak/n$ for some constant $a > 0$. Thus the probability that a
phase lasts more than $\ell = (bn/k) \ln n$ steps is at most
$(1 - ak/n)^{(bn/k) \ln n} \le e^{-ab \ln n} = n^{-ab}$.

Now suppose that the phase lasts at most $\ell = (bn/k) \ln n$ steps. A standard Hoeffding bound
implies that for all $\ell' \le \ell = (bn/k) \ln n$, the probability that a walk of length $\ell'$
is at a node more than $|F| = c\sqrt{n/k} \ln n$ away from the starting point of the phase is at
most $e^{-2|F|^2/\ell} = e^{-2c^2 (n/k) \ln^2 n/\ell}$. Now a simple union bound over all
$\ell' \le \ell$ implies that the probability that the random walk hits $v$ during the phase
(conditioned on the phase lasting at most $\ell$ steps) is at most
$\ell \cdot e^{-2c^2 (n/k) \ln^2 n/\ell} = (bn/k) \ln n \cdot e^{-(2c^2/b) \ln n} \le b \ln n \cdot n^{-(2c^2 - 1)/b}$.
To put these bounds together, let $A$ be the event that the random walk hits $v$ during the phase,
and let $B$ be the event that the phase lasts more than $\ell$ steps. Then

$\Pr[A] = \Pr[A|B] \Pr[B] + \Pr[A|\bar{B}] \Pr[\bar{B}] \le \Pr[B] + \Pr[A|\bar{B}] \le n^{-ab} + b \ln n \cdot n^{-(2c^2 - 1)/b}$.


For any constant $a$, if we set $b = 4/a$ and $c = \sqrt{2b + (1/2)}$ then we get that
$\Pr[A] \le O(n^{-4} \ln n) \le O(n^{-3})$. Hence the expected number of phases starting outside $F$
that occur before one of them hits $v$ is $\Omega(n^3)$, and thus the hitting time from $u$ to $v$
(under $k$-smoothing) conditioned on $v$ first being hit in a phase beginning outside of $F$ is
$\Omega(n^3)$.


Case 2: Suppose that $v$ is first hit in a phase with starting point in $F$. We will show that the
hitting time is large by showing that the expected time outside of such phases is large. We define
two random variables. Let $A_t$ be the number of steps between the end of phase $t - 1$ and the
beginning of phase $t$, and let $B_t$ be an indicator random variable for the event that the first
$t - 1$ phases all start outside of $F$ (where we set $B_1 = 1$ by definition). Then clearly the
hitting time from $u$ to $v$, conditioned on $v$ being first hit in a phase with starting point in
$F$, is at least $E[\sum_t A_t B_t]$. Since $A_t$ and $B_t$ are independent, this is equal to
$\sum_t E[A_t] E[B_t]$.

A phase begins in one of two ways: either following a randomly added smoothed edge into $V_2$
(from either $V_1$ or $V_2$), or following the single edge in the lollipop from $V_1$ to $\pi(1)$.
If it begins by following a smoothed edge, then the starting point is uniformly distributed in
$V_2$. Since $\pi(1) \notin F$, this clearly implies that
$E[B_t] \ge (1 - |F|/|V_2|)^{t-1} = (1 - 2c \ln n/\sqrt{nk})^{t-1}$.

To analyze $E[A_t]$, again note that phase $t - 1$ ends by either following a smoothed edge or
walking on the lollipop from $V_2$ to $V_1$. Since the other endpoint of a smoothed edge is
distributed uniformly among all nodes, the probability that phase $t - 1$ ended by walking to $V_1$
is at least 1/2. So $E[A_t]$ is at least 1/2 times the expectation of $A_t$ conditioned on phase
$t - 1$ ending in $V_1$. This in turn is (essentially by definition) just 1/2 times the expected
time in $V_1$ before walking to $V_2$. So consider a random walk that is at some node in $V_1$.
Clearly the hitting time to $\pi(1)$ without using smoothed edges is $\Omega(n^2)$ (we have a
$1/n^2$ chance of walking to $v^*$ and then to $\pi(1)$). The other way of starting a phase would
be to follow a smoothed edge to $V_2$. By Lemma 4.3 the probability at each time step that the
random walk is incident on a smoothed edge with other endpoint in $V_2$ is at most $O(k/n)$. Since
the degree of any node in $V_1$ under $k$-smoothing is with high probability $\Omega(n)$, the
probability that we follow a smoothed edge to $V_2$ if one exists is only $O(1/n)$, and hence the
total probability of following a smoothed edge from $V_1$ to $V_2$ is at most $O(k/n^2)$. Thus
$E[A_t] \ge \Omega(n^2/k)$.

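So we get an overall bound on the hitting time (conditioned on $v$ being first hit by a phase
beginning in $F$) of
$\sum_t E[A_t] E[B_t] \ge \Omega(n^2/k) \cdot \sum_{t \ge 1} (1 - 2c \ln n/\sqrt{nk})^{t-1} = \Omega(n^2/k) \cdot \sqrt{nk}/(2c \ln n) = \Omega(n^{5/2}/(\sqrt{k} \ln n))$.
Thus in both cases the hitting time is at least $\Omega(n^{5/2}/(\sqrt{k} \ln n))$, completing the
proof of the theorem. □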


If we do not insist on the dynamic graph being connected at all times, then in fact Theorem 6.1 
is tight via a very simple example: a clique with a single disconnected node. 

Theorem 6.4. There is a dynamic graph $\mathcal{H}$ and vertices $u, v$ such that
$H_k(u, v) \ge \Omega(n^3/k)$ for all $k \le n$, where smoothing is with respect to
$\mathcal{G}_{all}$.

Proof. Let each $G_i$ be the same graph: a clique on $n - 1$ vertices and a single node $v$ of
degree 0. Let $u$ be an arbitrary node in the clique, and consider a random walk starting from $u$.
At each time $t$, the probability that an edge exists from the current location of the walk to $v$
is $O(k/n^2)$. If such an edge does exist, then the random walk follows it with probability
$1/(n - 1)$. Hence the total probability of moving to $v$ at any time is only $O(k/n^3)$, and thus
the hitting time is $\Omega(n^3/k)$. □


7 Aggregation 

Here we consider the aggregation problem in the pairing dynamic network type. Notice that in our
study of flooding and random walks we were analyzing the behavior of a specific, well-known
distributed process. In this section, by contrast, we consider the behavior of arbitrary
algorithms. In particular, we will show that the pessimistic lower bound for the aggregation
problem for 0-smoothed pairing graphs from [4] holds (within constant factors), even for relatively
large amounts of smoothing. This problem, therefore, provides an example of where smoothing does
not help much.

The Aggregation Problem. The aggregation problem, first defined in [4], assumes each node
$u \in V$ begins with a unique token $\sigma[u]$. The execution proceeds for a fixed length
determined by the length of the dynamic graph.^1 At the end of the execution, each node $u$ uploads
a set (potentially empty) $\gamma[u]$ containing tokens. An aggregation algorithm must avoid both
losses and duplications (as would be required if these tokens were actually aggregated in an
accurate manner). Formally:

Definition 7.1. An algorithm $A$ is an aggregation algorithm if and only if at the end of every
execution of $A$ the following two properties hold:

(1) No Loss: $\bigcup_{u \in V} \gamma[u] = \{\sigma[u] : u \in V\}$. (2) No Duplication:
$\forall u, v \in V, u \ne v : \gamma[u] \cap \gamma[v] = \emptyset$.

To evaluate the performance of an aggregation algorithm we introduce the notion of aggregation
factor. At the end of an execution, the aggregation factor of an algorithm is the number of nodes
that upload at least one token (i.e., $|\{u \in V : \gamma[u] \ne \emptyset\}|$). Because some
networks (e.g., a static clique) are more suitable for small aggregation factors than others
(e.g., no edges in any round), we evaluate the competitive ratio of an algorithm's aggregation
factor as compared to the offline optimal performance for the given network.

^1 This is another technical difference between the study of aggregation and the other problems
considered in this paper. For flooding and random walks, the dynamic graphs were considered to be
of indefinite size. The goal was to analyze the process in question until it met some termination
criteria. For aggregation, however, the length of the dynamic graph matters as this is a problem
that requires an algorithm to aggregate as much as it can in a fixed duration that can vary
depending on the application scenario. An alternative version of this problem can ask how long an
algorithm takes to aggregate to a single node in an infinite length dynamic graph. This version of
the problem, however, is less interesting, as the hardest case is the graph with no edges, which
when combined with smoothing reduces to a standard random graph style analysis.

The worst possible performance, therefore, is $n$, which implies that the algorithm uploaded from
$n$ times as many nodes as the offline optimal (note that $n$ is the maximum possible value for an
aggregation factor). This is only possible when the algorithm achieves no aggregation and yet an
offline algorithm could have aggregated all tokens to a single node. The best possible performance
is a competitive ratio of 1, which occurs when the algorithm matches the offline optimal
performance.
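The correctness conditions and the performance measure can be expressed as small checks. In this sketch the initial tokens and the uploaded sets are modelled as dicts keyed by node (called `sigma` and `gamma` here), and "no loss" is read as "every initial token is uploaded somewhere"; the function names and this encoding are assumptions made for illustration.

```python
def satisfies_aggregation(sigma, gamma):
    """Check the no-loss and no-duplication properties.

    sigma: dict node -> its unique initial token
    gamma: dict node -> set of tokens uploaded by that node
    """
    uploaded = [tok for u in gamma for tok in gamma[u]]
    # No duplication: no token appears in the upload sets of two nodes.
    no_duplication = len(uploaded) == len(set(uploaded))
    # No loss: the uploaded tokens are exactly the initial tokens.
    no_loss = set(uploaded) == set(sigma.values())
    return no_loss and no_duplication


def aggregation_factor(gamma):
    """Number of nodes that upload at least one token."""
    return sum(1 for tokens in gamma.values() if tokens)


def competitive_ratio(gamma, offline_optimal_factor):
    """Aggregation factor relative to the offline optimum for the same graph."""
    return aggregation_factor(gamma) / offline_optimal_factor


sigma = {0: "a", 1: "b", 2: "c"}
gamma = {0: {"a", "b"}, 1: set(), 2: {"c"}}
print(satisfies_aggregation(sigma, gamma), aggregation_factor(gamma))
```

In the example two nodes upload, so if an offline algorithm could have gathered all three tokens at one node, the competitive ratio would be 2.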

Results Summary. In [4], the authors prove that no aggregation algorithm can guarantee better
than an $\Omega(n)$ competitive ratio with a constant probability or better. In more detail:

Theorem 7.2 (Adapted from [4]). For every aggregation algorithm $A$, there exists a pairing graph
$\mathcal{H}$ such that with probability at least 1/2, $A$'s aggregation factor is $\Omega(n)$
times worse than the offline optimal aggregation factor in $\mathcal{H}$.

Our goal in the remainder of this section is to prove that this strong lower bound persists even
after a significant amount of smoothing (i.e., $k = O(n/\log^2 n)$). We formalize this result below
(note that the cited probability is with respect to the random bits of both the algorithm and the
smoothing process):

Theorem 7.3. For every aggregation algorithm $A$ and smoothing factor $k \le n/(32 \log^2 n)$,
there exists a pairing graph $\mathcal{H}$ such that with probability at least 1/2, $A$'s
aggregation factor is $\Omega(n)$ times worse than the offline optimal aggregation factor in a
$k$-smoothed version of $\mathcal{H}$ (with respect to $\mathcal{G}_{pair}$).

7.1 Lower Bound 

Here we prove that for any smoothing factor $k \le (cn)/\log^2 n$ (for some positive constant
fraction $c$ we fix in the analysis), $k$-smoothing does not help aggregation by more than a
constant factor as compared to 0-smoothing. To do so, we begin by describing a probabilistic
process for generating a hard pairing graph. We will later show that the graph produced by this
process is likely to be hard for a given randomized algorithm. To prove our main theorem, we will
conclude by applying the probabilistic method to show this result implies the existence of a hard
graph for each algorithm.

The $\alpha$-Stable Pairing Graph Process. We define a specific process for generating a pairing
graph (i.e., a graph in allowed($\mathcal{G}_{pair}$)). The process is parameterized by some
constant integer $\alpha \ge 1$. In the following, assume the network size $n = 2\ell$ for some
integer $\ell \ge 1$ that is also a power of 2.^2 For the purposes of this construction, we label
the $2\ell$ nodes in the network as $a_1, b_1, a_2, b_2, \ldots, a_\ell, b_\ell$. For the first
$\alpha$ rounds, our process generates graphs with the edge set
$\{(a_i, b_i) : 1 \le i \le \ell\}$. After these rounds, the process generates $\ell$ bits,
$q_1, q_2, \ldots, q_\ell$, with uniform randomness. It then defines a set $S$ of selected nodes by
adding to $S$ the node $a_i$ for every $i$ such that $q_i = 0$ and adding $b_i$ for every $i$ such
that $q_i = 1$. That is, for each of our $(a_i, b_i)$ pairs, the process randomly flips a coin to
select a single element from the pair to add to $S$.

For all graphs that follow, the nodes not in $S$ will be isolated in the graph (i.e., not be
matched). We turn our attention to how the process adds edges between the nodes that are in $S$. To
do so, it divides the graphs that follow into phases, each consisting of $\alpha$ consecutive
rounds of the same graph. In the first phase, this graph is the one that results when the process
pairs up the nodes in $S$ by adding an edge between each such pair (these are the only edges). In
the second phase, the process defines a set $S_2$ that contains exactly one node from each of the
pairs from the first phase. It then pairs up the nodes in $S_2$ with edges as before. It also pairs
up all nodes in $S \setminus S_2$ arbitrarily. Every graph in the second phase includes only these
edges. In the third phase, the process defines a set $S_3$ containing exactly one node from each of
the $S_2$ pairs from the previous phase. It then once again pairs up the remaining nodes in $S$
arbitrarily. The process repeats this procedure until phase $t = \log_2 |S|$, at which point only a
single node is in $S_t$, and we are done.

^2 We can deal with odd $n$ and/or $\ell$ not a power of 2 by suffering only a constant factor cost
to our final performance. For simplicity of presentation, we maintain these assumptions for now.

The total length of this dynamic graph is $\alpha(\log_2(|S|) + 1)$. It is easy to verify that it
satisfies the definition of the pairing dynamic network type.
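The process just described can be sketched as follows. This is an illustrative reading rather than the authors' construction: the function name `stable_pairing_process`, the node labels `("a", i)` and `("b", i)`, and the specific "arbitrary" choices (pair adjacent list entries, keep the first node of each pair) are all assumptions.

```python
import random


def stable_pairing_process(ell, alpha, seed=None):
    """Return (rounds, S): a list of matchings, one per round, and the
    selected set S. Assumes ell is a power of 2 (so n = 2 * ell)."""
    assert ell >= 1 and ell & (ell - 1) == 0, "ell must be a power of 2"
    rng = random.Random(seed)
    a = [("a", i) for i in range(1, ell + 1)]
    b = [("b", i) for i in range(1, ell + 1)]

    # First alpha rounds: the fixed matching {(a_i, b_i)}.
    rounds = [list(zip(a, b)) for _ in range(alpha)]

    # One fair coin per pair selects which endpoint joins S.
    S = [b[i] if rng.getrandbits(1) else a[i] for i in range(ell)]

    current, retired = list(S), []        # current = S_j, retired = S \ S_j
    while len(current) > 1:
        core = [(current[i], current[i + 1]) for i in range(0, len(current), 2)]
        rest = [(retired[i], retired[i + 1]) for i in range(0, len(retired), 2)]
        rounds.extend(list(core + rest) for _ in range(alpha))
        retired += [y for _, y in core]   # one node of each pair drops out
        current = [x for x, _ in core]    # the other one moves on to S_{j+1}
    return rounds, S


rounds, S = stable_pairing_process(ell=8, alpha=3, seed=1)
print(len(rounds), len(S))
```

With $\ell = 8$ and $\alpha = 3$ the sketch produces $3 \cdot (\log_2 8 + 1) = 12$ rounds, matching the length formula above.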

Performance of the Offline Optimal Aggregation Algorithm. We now show that even with lots of
smoothing, a graph generated by the stable pairing graph process, parameterized with a sufficiently
large $\alpha$, yields a good optimal solution (i.e., an aggregation factor of 1).

Lemma 7.4. For any $k \le n/32$, and any pairing graph $\mathcal{H}$ that might be generated by the
$(\log n)$-stable pairing graph process, with high probability in $n$: the offline optimal
aggregation algorithm achieves an aggregation factor of 1 in a $k$-smoothed version of
$\mathcal{H}$.

Proof. In the following, let $\alpha = \log n$ be the stability parameter provided to the graph
process. We will describe an offline algorithm that guarantees, for every possible dynamic graph
produced by the $\alpha$-stable pairing graph process, that with high probability (over the
smoothing choices) the algorithm aggregates all values to a single node. It follows, of course,
that the offline optimal algorithm achieves this same optimal factor.

To begin our argument, consider some dynamic graph generated by the $\alpha$-stable pairing
process. We divide its graphs into phases of $\alpha$ consecutive static graphs each. Recall from
our definition of the graph process that during each phase, some nodes are paired and some nodes
are isolated. It also follows from our definition of this process that the graphs within a given
phase are all the same.

Fix some phase $i$ and some edge $(u, v)$ in that phase's graph. Our key observation is that this
edge is included in at least one of the smoothed graphs during this phase (i.e., there is at least
one round where the smoothing does not remove this edge), with high probability in $n$. To prove
this observation, we first focus on a single round in this phase and show there is at least a
constant probability that $(u, v)$ is left alone in this round. To prove this result we apply
Lemma 4.4 to $S = M$ (where $M$ is the set of matched nodes in this round), $\delta = 2$, and node
$u$. (We obtain our $\delta$ value by noting that, by definition, the set of matched nodes in every
round of every graph produced by our adversary includes at least half the nodes.) The probability
that $(u, v)$ is removed in our round is less than or equal to the probability $p_u$ that $u$ is
affected in the round. Lemma 4.4 tells us that $p_u \le (4k)/n$. Given our assumption that
$k \le n/32$, it follows that $p_u \le 4/32 = 1/8$. Therefore, the probability that $(u, v)$ is
removed by smoothing in every round of phase $i$ is no more than:

$p_u^{\alpha} \le (1/8)^{\log n} = (1/2^3)^{\log n} = 1/n^3$.

To complete the proof, we apply two union bounds to obtain this property for all matched edges
in all phases, while avoiding dependencies. First, within a given phase, a union bound provides
that with probability at least $1 - 1/n^2$, all matched edges (of which there are fewer than $n$)
in the original graph are preserved at least once. Another union bound over the total number of
phases (which is also less than $n$) provides that this holds for every phase with probability at
least $1 - 1/n$. It is then simple to verify from the definition of the smoothing process that it
is possible to aggregate every value to the single node that remains in $S_t$ (i.e., the root of
the tree induced by the phases greater than 1). □
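The arithmetic behind these two union bounds can be checked exactly with rational numbers. The sketch below assumes $n$ is a power of 2 and uses the per-round removal bound of 1/8 from the argument above; the function names are hypothetical, and the counts of edges and phases are replaced by their crude upper bound $n$.

```python
from fractions import Fraction


def edge_never_survives_bound(n):
    """Bound on the probability that one fixed matched edge is removed in
    every one of the alpha = log2(n) rounds of a phase."""
    alpha = n.bit_length() - 1            # log2(n) for n a power of 2
    return Fraction(1, 8) ** alpha        # (1/8)^{log n} = 1/n^3


def some_edge_lost_bound(n):
    """Union bound over fewer than n edges per phase and fewer than n phases."""
    return n * n * edge_never_survives_bound(n)


n = 1024
print(edge_never_survives_bound(n) == Fraction(1, n**3), some_edge_lost_bound(n))
```

For $n = 1024$ the overall failure bound comes out to exactly $1/n$, which is the "with high probability" guarantee used in the lemma.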

Performance of an Arbitrary Distributed Aggregation Algorithm. We now fix an arbitrary
distributed aggregation algorithm and demonstrate that it cannot guarantee (with good
probability) to achieve a non-trivial competitive ratio in all pairing graphs. In particular, we
will show it has a constant probability of performing poorly in a graph generated by our above
process.

Lemma 7.5. Fix an online aggregation algorithm $A$ and smoothing factor $k \le n/(32 \log^2 n)$.
Consider a $k$-smoothed version of a graph $\mathcal{H}$ generated by the $(\log n)$-stable pairing
graph process. With probability greater than 1/2 (over the smoothing, adversary, and algorithm's
independent random choices): $A$ has an aggregation factor in $\Omega(n)$ when executed in this
graph.

Proof. Consider the pairing graph generated by the stable pairing graph process, before the
smoothing is applied. Let $S$ be the set of selected nodes (see the process definition) and
$\bar{S} = V \setminus S$ be the non-selected nodes. By definition, $|S| = |\bar{S}| = n/2$.

Let $Y$ be the number of times that nodes in $\bar{S}$ are affected by smoothing in the
$s = O(\log^2 n)$ individual graphs generated by the $(\log n)$-stable pairing graph process. To
calculate the expectation of $Y$, let $X_{i,r}$ be the indicator random variable that is 1 if the
$i$-th node in $\bar{S}$ (by some arbitrary but fixed ordering) is affected by smoothing in the
$r$-th graph generated by the process, and otherwise 0. It follows that
$Y = \sum_{i \in [|\bar{S}|], r \in [s]} X_{i,r}$, and therefore

$E(Y) = \sum_{i,r} E(X_{i,r}) = \sum_{i,r} \Pr(X_{i,r} = 1)$.

We can upper bound $\Pr(X_{i,r} = 1)$, for any particular $i$ and $r$, by applying Lemma 4.4 to the
set $\bar{S}$ of unmatched nodes and $\delta = 2$. It follows that:

$\Pr(X_{i,r} = 1) \le (2 \cdot \delta \cdot k)/n = (4k)/n \le 4/(32 \log^2 n) = 1/(8 \log^2 n)$.

Because we are summing over $|\bar{S}| \cdot s \le (n/2) \log^2 n$ values, we obtain the bound
$E(Y) \le n/16$. We now apply Markov's inequality to establish that $Y \ge n/4$ with probability no
more than 1/4. Let $U$ denote the set of nodes in $\bar{S}$ that remain undisturbed by smoothing.
Notice, we just proved that with probability at least 3/4, $|U| \ge n/4$.
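The expectation and the Markov step can be verified with exact arithmetic. The sketch below plugs in the largest smoothing value allowed by the lemma and takes the number of rounds to be exactly $\log^2 n$; the function names and that choice of $s$ are assumptions made for the illustration.

```python
from fractions import Fraction


def expected_disturbances_bound(n, log_n):
    """Upper bound on E(Y): (n/2) nodes times s rounds times the per-node,
    per-round probability 4k/n from Lemma 4.4 with delta = 2."""
    log_sq = log_n ** 2
    k = Fraction(n, 32 * log_sq)          # largest smoothing value allowed
    p = 4 * k / n                         # equals 1 / (8 log^2 n)
    s = log_sq                            # assumed number of rounds
    return Fraction(n, 2) * s * p


def markov_bound(expectation, threshold):
    """Markov's inequality: Pr[Y >= threshold] <= E(Y) / threshold."""
    return expectation / threshold


n, log_n = 1024, 10
e_y = expected_disturbances_bound(n, log_n)
print(e_y, markov_bound(e_y, Fraction(n, 4)))
```

With $n = 1024$ the bound on $E(Y)$ is $64 = n/16$, and Markov's inequality then gives probability at most 1/4 for $Y \ge n/4$, as claimed.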

We now want to consider what happens to nodes in U. Fix some x G U. Let y G S he the 
node connected to x throughout phase 1 in the graph generated by the graph process. Because 
X G U, it follows that this edge is undisturbed throughout phase 1. It follows that x and y have 
no opportunity to learn new tokens or to pass on their existing tokens outside the pair during 
phase 1. A key property proved in [4] is that at the end of phase 1 we can assign owners to x and 
y’s tokens among x and y. To do so, consider what would happen if we extend this graph such 
that going forward x and y are isolated. It must be the case that $\gamma[x] \cup \gamma[y] = \{\sigma[x], \sigma[y]\}$ and
$\gamma[x] \cap \gamma[y] = \emptyset$. If these properties did not hold in this extension, then no duplication and/or no
loss would be violated in this extension, and the aggregation problem requires these properties to
always hold. It follows that at least one of these two nodes has a non-empty $\gamma$ set at the end of
this isolation extension. Because the graph process selected which node went to S from among 
this pair with uniform and independent randomness, the probability that the node not chosen for 
S ends up owning at least one token (by our above definition of ownership) is at least 1/2. Put
another way, for each node in U, with independent probability at least 1/2, that node will end up 
outputting at least one token at the end of the execution. We expect that at least half the nodes in 
U will output tokens. Another application of Markov’s inequality tells us that the probability that fewer than
a smaller constant fraction of these nodes end up uploading is less than 1/4.

Pulling together the pieces, we first proved that the probability that $|U| < n/4$ is at most 1/4.
We then proved that the probability that fewer than $|U|/j$ nodes end up uploading, for some constant
$j$, is less than 1/4. A union bound provides that the probability at least one of these bad events
occurs is strictly less than 1/2. In the case that neither bad event occurs (a case that holds with
probability strictly more than 1/2), we end up with an aggregation factor of at least $(1/j)(n/4) \in \Omega(n)$, as
required by the lemma statement. □
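
The concluding step of the proof above can be summarized as follows. This is a sketch only: the labels $B_1$ (the event $|U| < n/4$) and $B_2$ (the event that fewer than $|U|/j$ nodes of $U$ upload) are names introduced here for illustration, and the probabilities are the ones stated in the proof ($\Pr(B_1) \le 1/4$ and $\Pr(B_2) < 1/4$).

```latex
\begin{align*}
\Pr(B_1 \cup B_2) &\le \Pr(B_1) + \Pr(B_2) < \frac{1}{4} + \frac{1}{4} = \frac{1}{2}, \\
\text{and if neither } B_1 \text{ nor } B_2 \text{ occurs:}\quad
\#\{\text{uploading nodes}\} &\ge \frac{|U|}{j} \ge \frac{1}{j} \cdot \frac{n}{4} \in \Omega(n).
\end{align*}
```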

A final union bound combines the results from Lemmas 7.4 and 7.5 to get our final corollary.
Applying the probabilistic method to the corollary yields the main theorem—Theorem 7.3. 

Corollary 7.6. Fix an aggregation algorithm $A$ and smoothing factor $k \le n/(32 \cdot \log^2 n)$. There is
a method for probabilistically constructing a pairing graph $\mathcal{H}$, such that with probability greater than
1/2 (over the smoothing, adversary, and algorithm’s independent random choices): $A$’s aggregation
factor in a $k$-smoothed version of $\mathcal{H}$ is $\Omega(n)$ times larger than the offline optimal factor for this
graph.